A Post by Michael B. Spring
Bookmarks and Meaning (December 15, 2007)
Social bookmarking systems provide a new source of information
about resources. In this post, I set out a conceptual view of social
bookmarking as a basis for asking what might be derived from an
analysis of social bookmarks. The delicious system works as follows:
- A user posts a URL
- To save the URL, the user must describe it. The description could
default to the page title, but it may be more bookmarker-centered than
page-author-centered
- The user may add user notes and tags
- The user may decide not to share the bookmark, making it private
With this in mind, at the very least, a social bookmarking system
would include a record that consists of a URLID (the normalized URL),
a USERID, a DESCription, OPTTag(s), OPTNotes, and SHARE (default TRUE).
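As a minimal sketch, the fields listed above could be modeled as follows. The class name `Bookmark`, the Python field names, and the `normalize_url` helper are illustrative assumptions, not part of delicious, and the normalization shown is only one plausible choice.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Hypothetical normalization: lowercase the scheme and host,
    drop the fragment and any trailing slash on the path."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


@dataclass
class Bookmark:
    url_id: str                                    # URLID: the normalized URL
    user_id: str                                   # USERID
    desc: str                                      # DESCription (required)
    tags: List[str] = field(default_factory=list)  # OPTTag(s)
    notes: str = ""                                # OPTNotes
    share: bool = True                             # SHARE, default TRUE


bookmark = Bookmark(
    url_id=normalize_url("HTTP://Example.COM/Page/#top"),
    user_id="u1",
    desc="A page worth keeping",
)
```

Only the description is required beyond the URL and user; tags and notes default to empty, and a bookmark is shared unless the user says otherwise.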
A conceptual table such as this has the potential to provide the following
information:
- The number of URLs that have been recorded
- The number of users of the system
- The number of user-URLs that are marked private
- The number of user-URLs that are shared
- The number of URLs that are tagged
- The number of user-URLs that have user notes
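The counts listed above fall out of a single pass over the table. The following is a sketch only: the `Row` type, the sample rows, and the key names in the result are made up for illustration.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    url_id: str
    user_id: str
    desc: str
    tags: List[str]
    notes: str
    share: bool


# Hypothetical sample data, one row per user-URL pair.
rows = [
    Row("http://a.example", "ann", "Site A", ["web", "design"], "", True),
    Row("http://a.example", "bob", "A again", [], "read later", False),
    Row("http://b.example", "ann", "Site B", ["python"], "", True),
    Row("http://c.example", "cat", "Site C", [], "", True),
]


def summarize(rows):
    """Return the six basic counts for a bookmark table."""
    return {
        "urls": len({r.url_id for r in rows}),
        "users": len({r.user_id for r in rows}),
        "private_user_urls": sum(1 for r in rows if not r.share),
        "shared_user_urls": sum(1 for r in rows if r.share),
        # A URL counts as tagged if at least one user tagged it.
        "tagged_urls": len({r.url_id for r in rows if r.tags}),
        "user_urls_with_notes": sum(1 for r in rows if r.notes),
    }


summary = summarize(rows)
```

Note that URLs and users are counted as distinct values, while the private, shared, and notes counts are over user-URL pairs.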
For users, we can determine the following information:
- The minimum, maximum, average, median number of total, shared, and
private URLs/user
- Various measures of the variance in the total, shared, and
private URLs across users
- The minimum, maximum, average, median number of tags/user
- Various measures of the variance in the number of tags across users
- The minimum, maximum, average, median number of descriptions/user
- Various measures of the variance in the number of descriptions
across users
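A sketch of the per-user measures, using Python's standard `statistics` module. The sample rows and the helper names `describe` and `urls_per_user` are assumptions for illustration; population variance stands in for the "various measures of the variance" mentioned above.

```python
from collections import Counter
from statistics import mean, median, pvariance

# Hypothetical sample data: (user_id, url_id, share)
rows = [
    ("ann", "u1", True),
    ("ann", "u2", False),
    ("ann", "u3", True),
    ("bob", "u1", True),
    ("cat", "u2", True),
    ("cat", "u4", False),
]


def urls_per_user(rows, shared=None):
    """Count URLs per user: all (shared=None), shared only (True),
    or private only (False). Users with no matching rows count as 0."""
    users = {user for user, _, _ in rows}
    counts = Counter(
        user for user, _, share in rows if shared is None or share == shared
    )
    return {user: counts[user] for user in users}


def describe(values):
    """Summarize a collection of per-user (or per-URL) counts."""
    values = list(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "variance": pvariance(values),
    }


total_stats = describe(urls_per_user(rows).values())
private_stats = describe(urls_per_user(rows, shared=False).values())
```

Keeping users with zero private bookmarks in the distribution matters: dropping them would inflate the minimum and the average.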
For URLs, we can determine:
- The minimum, maximum, average, median number of total, shared, and
private users/URL
- Various measures of the variance in the total, shared, and
private users across URLs
- The minimum, maximum, average, median number of tags/URL
- Various measures of the variance in the number of tags across
URLs
- The minimum, maximum, average, median number of unique tags/URL
- Various measures of the variance in the number of unique tags
across URLs
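The difference between tags/URL and unique tags/URL can be made concrete with a small sketch. The sample rows and the function name are hypothetical, and treating tags as case-insensitive is an assumption, not something the post specifies.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample data: (url_id, user_id, tags)
rows = [
    ("u1", "ann", ["python", "code"]),
    ("u1", "bob", ["python"]),
    ("u1", "cat", ["Python", "tutorial"]),
    ("u2", "ann", ["news"]),
    ("u3", "bob", []),
]


def tag_counts_per_url(rows):
    """Map each URL to (total tag applications, number of distinct tags)."""
    tags_by_url = defaultdict(list)
    for url, _, tags in rows:
        # Assumption: tags are compared case-insensitively.
        tags_by_url[url].extend(tag.lower() for tag in tags)
    return {
        url: (len(tags), len(set(tags))) for url, tags in tags_by_url.items()
    }


counts = tag_counts_per_url(rows)
mean_total = mean(total for total, _ in counts.values())
median_unique = median(unique for _, unique in counts.values())
```

A URL where the total is much larger than the unique count is one where many users agree on the same few tags.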
Beyond these measures we can examine a number of issues:
- Looking at tags, ordered by frequency of occurrence:
  - are there obvious groupings of types of tags (semantic,
    affective, personal)?
  - do the most frequently occurring tags tell us anything
    about the collection?
  - are there patterns in the co-occurrence of tags? That is, for
    some threshold of frequency of co-occurrence across URLs, is there
    a clear relationship between the co-occurring terms that allows us
    to simplify or clarify the tagging? Does the same hold for low
    co-occurrence terms, i.e., can we say some things about those terms?
  - Is it possible to develop a tag map that would work as follows?
    Take the n most frequently occurring terms and set them around the
    circumference of a circle. Take any term that co-occurs with one of
    those terms more than x% (e.g. 90%) of the time and bundle it with
    the more frequently occurring term. (If the bundled term was one of
    the original n, add the next most frequent term to the circle.)
    Take terms that co-occur 50-90% of the time and place them on
    strings proportionally distant from the terms they co-occur with.
    If they co-occur with two or three terms on the circle, web them
    such that they are proportionally distant from all the terms. If
    they only occur with one term, fan them outside the circle
    proportionally distant from the term. What kind of term map does
    that provide, and how might it be improved?
- When we look at tags by users:
  - can we identify communities of interest (common
    frequently occurring tags)?
  - can we identify expertise (high number of URLs with
    levels of commonly used tags)?
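The tag map idea can be tried out in a rough sketch that covers only the classification step, not the drawing. Several things here are assumptions: the co-occurrence rate of a term with a circle term is taken as the share of the term's URLs that also carry the circle term, the string length is taken as 1 minus that rate, and the function name, parameters, and sample data are invented. The sketch also skips the step of promoting a new term when a circle term is itself bundled.

```python
from collections import defaultdict

# Hypothetical sample data: URL -> set of tags applied by any user.
url_tags = {
    "u1": {"python", "programming", "tutorial"},
    "u2": {"python", "programming"},
    "u3": {"python", "web", "tutorial"},
    "u4": {"web", "css", "tutorial"},
    "u5": {"web", "css"},
    "u6": {"python", "web", "design"},
    "u7": {"design"},
    "u8": {"design", "web"},
}


def tag_map(url_tags, n=2, bundle_at=0.9, string_at=0.5):
    """Classify tags into circle terms, bundled terms, and terms on strings."""
    urls_by_tag = defaultdict(set)
    for url, tags in url_tags.items():
        for tag in tags:
            urls_by_tag[tag].add(url)

    # Rank by number of URLs; break ties alphabetically for determinism.
    ranked = sorted(urls_by_tag, key=lambda t: (-len(urls_by_tag[t]), t))
    circle = ranked[:n]

    bundled = {term: [] for term in circle}
    strings = {}
    for term in ranked[n:]:
        term_urls = urls_by_tag[term]
        rates = {
            c: len(term_urls & urls_by_tag[c]) / len(term_urls) for c in circle
        }
        best = max(circle, key=lambda c: rates[c])
        if rates[best] > bundle_at:
            bundled[best].append(term)
            continue
        # Assumed distance: the weaker the co-occurrence, the longer the string.
        ties = {c: 1 - rate for c, rate in rates.items() if rate >= string_at}
        if ties:
            strings[term] = ties
    return circle, bundled, strings


circle, bundled, strings = tag_map(url_tags)
```

With this sample data, a term attached to two circle terms (here `tutorial`) would be webbed between them, while a term attached to only one (here `design`) would fan outside the circle. Terms below the lower threshold are simply left off the map, which is one of the places the scheme might be improved.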
There are surely many more questions that we might try to answer and
there are surely more formal ways of formulating what might be inferred.
I will be returning to this entry in the coming months and trying to add
more thoughts about this.